EVALUATION


The  objective  of  this  effort  was  to  investigate  a  technique  for 
automatically  measuring  the  intelligibility  of  speech  processed  over 
a  communications  channel.  A  breadboard  model  was  built  and  successfully 
demonstrated  the  ability  to  measure  speech  intelligibility. 

Several efforts have attempted to produce an automated speech intelligibility
measurement device, but very few have produced effective hardware.
The  technique  evaluated  here  promises  to  overcome  the  failures  of  many 
past  efforts.  The  results  presented  here  should  help  lead  to  a  final 
solution  of  the  problem  of  efficiently  and  objectively  evaluating  the 
performance  of  voice  communications  systems. 

DONALD  M.  OTTINGER 
Project  Engineer 
1. INTRODUCTION

There are a number of criteria for evaluating
voice modems. These include Intelligibility, which
refers to the ability to understand words; Quality,
which refers basically to system acceptance; Listener
Fatigue; and Speaker Recognition. Of these, the most
important is Intelligibility. Many tests have been
developed for measuring Intelligibility. These involve
transmitting words, syllables, phrases or sentences
through the system, and having a panel of listeners try
to identify what was transmitted. The responses are
scored and an intelligibility score is then determined.

Two tests that have found wide acceptance are the Modified
Rhyme Test and the PB-50 Test. (An excellent
discussion of speech intelligibility and quality testing
is given in Section 8 of Reference (1).)

Intelligibility tests are expensive, often difficult
to reproduce, and sometimes inconclusive. To get around
these problems a number of analytic and semi-analytic
techniques have been developed. An example of an
analytic approach to the problem is given in Reference (2).
A semi-analytic technique that is often used is based on
the Articulation Index. This is instrumented in such
equipment as VIAS or SCIM by transmitting a "speech-like"
test signal through the system and measuring test signal
to noise at the output in a number of bands. The Articulation
Index is then computed automatically by the
equipment. Where test tone to noise measurements have
been made on the equipment, the Articulation Index can be
computed by weighting the test tone to noise ratio
as a function of the particular band. The method
for computing the articulation index is given in
Reference (3). Intelligibility scores can then be
estimated from the articulation index.

While  these  analytic  and  semi-analytic  techniques 
are  generally  less  expensive  and  more  repeatable  than  the 
intelligibility  test,  they  often  give  incorrect  results. 
The  analytic  and  semi-analytic  techniques  are  based  on 
certain  assumptions  of  the  nature  of  the  modem.  These 
assumptions  are  often  not  valid  and  there  is  considerable 
discrepancy between articulation index scores and intelligibility
scores for different types of modems.

The techniques based on test tone to noise measurements
or articulation index were developed primarily
for use on essentially linear systems such as amplitude
modulation or standard frequency modulation. Many modern
systems  are  basically  non-linear  in  nature.  For  example, 
in  delta  modulation,  slope  overload  will  cause  a  high 
frequency  signal  component  to  be  lost  in  the  presence 
of  a  relatively  strong  low  frequency  component.  A  low 
level  signal  component  will  also  be  masked  by  the 
quantization  noise.  On  the  other  hand,  a  median  level 
high  frequency  signal  will  be  passed  well  by  the  delta 
modulation  system.  Various  speech  phonemes  fall  into 
each  of  these  categories,  namely,  some  are  median  level 
high  frequency  signals,  some  are  low  level  signals,  and 
some  are  made  up  of  relatively  strong  low  frequency 
signals  with  high  frequency  components.  The  techniques 
based  on  test  tone  to  noise  measurements  or  articulation 
index  do  not  take  these  facts  into  account  and  therefore 
give  erroneous  results. 


Another  example  of  a  non-linear  technique  that  is 
used  in  many  modern  systems  to  improve  intelligibility 
at low C/N0 is speech signal compression. Signal
compression  enhances  low  level  phonemes,  but  introduces 
high  frequency  components  that  may  not  have  been  present 
in  the  original  signal.  A  technique  that  does  not  take 
both  of  these  facts  into  account  will  not  yield  a  correct 
score. 

Certain  phonemes,  in  particular,  the  stop  consonants 
such  as  "t"  or  "p"  depend  for  recognition  on  the  time 
history  of  the  sound  at  least  as  much  as  they  do  on  the 
frequency  composition.  A  number  of  voice  communication 
systems  do  not  perform  well  for  these  types  of  phonemes. 
These  include  some  of  the  vocoders  when  operated  at  low 
data  rate,  and  some  of  the  TASI  type  systems.  The 
intelligibility  measuring  techniques  based  on  articulation 
index  do  not  take  time  history  into  account,  and  thus 
give  erroneous  results  for  these  voice  communication 
systems. 

A system is described in this report which, by
using actual speech phonemes, overcomes many of the
problems that occur in existing intelligibility measuring
instruments. In particular, the system is based on
the use of a test tape on which is recorded a set of PB
words. The starting and ending phonemes of each word
are located in time by means of precise timing signals.
Prior to the use of the tape for the evaluation of
speech modems, the tape is played through a signal analysis
unit which measures, for each of the phonemes, the
energy content in a set of selected frequency bands and
during a set of selected time periods. Thus a time-frequency
matrix is obtained which is characteristic of
each phoneme. This time-frequency matrix, which serves
as a reference, is stored in a microcomputer.

The tape can then be played through a modem system for
which an intelligibility score is desired. The output
of the system is recorded on tape. The resulting
tape is then played through the signal analyzer unit,
which obtains a new time-frequency matrix for each
starting and ending phoneme. In obtaining the
time-frequency matrix, the precise timing signal is used to
ensure that the new time-frequency matrix is obtained at
exactly the same time as the reference time-frequency
matrix. The new time-frequency matrix, which is subject
to the noise and distortion of the modem, is fed to the
microcomputer, where it is compared with the reference,
and an intelligibility measure for each phoneme is
computed. The intelligibility score is then computed by
averaging with suitable weighting over all of the
phonemes.

The  instrumentation  for  voice  intelligibility 
measurement  described  in  this  report  promises  performance 
which  will  be  superior  to  that  of  existing  techniques. 

The most important reason for the improved performance
is the fact that actual speech phonemes are used, so
that the effect of noise or distortion is more realistically
taken into account. Since actual speech sounds
are used, the signal statistics are realistic, and the
instrumentation is usable for measuring performance of
modern modem systems which are based on speech signal
statistics. Further, the system is sufficiently
flexible so as to permit modifications to be made in the
intelligibility estimation algorithms as new knowledge
is obtained concerning speech intelligibility.
2. SYSTEM DESCRIPTION

2.1  Overall  Description.  The  complete  system 
consists  of  two  major  components:  the  test  tape  and  the 
scoring  system.  The  test  tape  contains  a  set  of  50 
phonetically  balanced  words  each  of  which  is  preceded  by 
a  timing  code.  A  block  diagram  of  the  scoring  system  is 
shown  in  Figure  1. 

Prior  to  delivery,  information  concerning  the 
frequency  content  of  the  starting  and  ending  phoneme  of 
each  of  the  50  words  is  stored  in  the  computer  memory  for 
reference.  This  is  accomplished  in  the  following  manner. 
The test tape is played through the system. Under
computer control the code preceding each word is decoded,
and the computer then times to the first sample measurement.
This time is, in general, different for each word,
and has been previously established by analysis. At the
sample time the content of each of the filters shown in
Figure 1 is sampled successively by the multiplexer.
Each of the 12 filter samples is converted to an 8 bit
digital word and is stored in computer memory. The
computer then times to the next sample and the filter
sampling process is repeated. The number of sample
measurements per phoneme is, in general, different for
each phoneme. This number has been determined by analysis
and is stored in computer memory. The entire process is
repeated for each word, starting in each case with the
decoding of the timing code. At the end of this calibration,
the computer memory contains the following information:
the number of samples in each phoneme for which
measurements are to be made; the time from the timing
code to each sample; and the frequency content of each
sample.
Figure 1. Block Diagram - Intelligibility Scoring System


To  evaluate  a  communications  system,  the  test  tape 
is  played  through  the  communication  system  and  a  new  tape 
made  from  the  output.  The  new  tape  contains  the  timing 
codes  and  words  of  the  original  test  tape  with  whatever 
distortion  and  noise  has  been  added  by  the  communication 
system. 

The new tape is then played through the Intelligibility
Scoring System. The operation is now similar
to  that  for  calibration.  The  timing  code  is  decoded,  the 
system  times  to  each  sample  time,  and  at  each  sample  time 
it  measures  and  stores  in  memory  the  contents  of  each  of 
the  12  filters.  Because  of  the  noise  and  distortion 
introduced  by  the  communication  system,  the  filter 
content  will,  in  general,  be  different  from  that  obtained 
with  the  original  test  tape. 

After  each  of  the  50  words  has  been  analyzed  as 
described  above,  an  intelligibility  measure  is  obtained 
for  each  phoneme,  an  intelligibility  score  is  obtained 
for  each  word,  and  the  word  intelligibility  scores  are 
then  averaged  to  obtain  an  overall  intelligibility  score 
for  the  communication  system. 

The intelligibility measure is a simple distance
measure, and is computed as follows:

$$
\text{Intelligibility Measure} =
\frac{(a_1 b_1 + a_2 b_2 + \cdots)^2}
     {(a_1^2 + a_2^2 + \cdots)\,(b_1^2 + b_2^2 + \cdots)}
$$

where: $a_i$ is the reference content of the $i$th filter
obtained during the calibration.
$b_i$ is the content of the $i$th filter obtained during the
test run.

The summation is over the 12 filters if only a single
sample is measured for the phoneme. If more than one
sample is measured for the phoneme, the summation will be
over 24 or 36 depending on whether two or three samples
are measured for the phoneme.
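The measure above can be sketched in a few lines of Python. This is only an illustration, not the report's implementation; the function name and the sample filter values are hypothetical. It follows the same arithmetic as line 290 of SCOR2 (Table 5), A=(S*S/SX)/SY, and shows why the measure is insensitive to overall signal level: scaling every test value by a constant leaves the result unchanged.

```python
def intelligibility_measure(reference, test):
    """Normalized squared correlation of two filter-content vectors.

    `reference` holds the calibration values a_i and `test` the test-run
    values b_i (12, 24 or 36 entries, depending on samples per phoneme).
    """
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")
    s = sum(a * b for a, b in zip(reference, test))   # sum of a_i * b_i
    sx = sum(a * a for a in reference)                # sum of a_i squared
    sy = sum(b * b for b in test)                     # sum of b_i squared
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat an all-zero vector as no match
    return (s * s) / (sx * sy)


if __name__ == "__main__":
    # Hypothetical 8-bit filter readings for one single-sample phoneme.
    ref = [12, 40, 90, 120, 60, 30, 20, 15, 10, 8, 5, 3]
    louder = [2 * a for a in ref]
    print(intelligibility_measure(ref, ref))
    print(intelligibility_measure(ref, louder))
```

A value of 1 means the test spectrum has the same shape as the reference; noise and distortion that change the shape pull the value toward 0.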

The  intelligibility  score  for  each  phoneme  is 
obtained from a linear relationship between intelligibility
score and intelligibility measure. The parameters
for  this  linear  relationship  were  obtained  by  comparing 
the  test  scores  obtained  with  human  subjects  for  each 
phoneme  with  the  corresponding  intelligibility  measure 
for  that  phoneme.  In  general,  the  relationship  is 
different  for  each  phoneme.  An  intelligibility  score  for 
the  word  is  obtained  by  selecting  the  lower  of  the  two 
intelligibility  scores  obtained  for  the  starting  and 
ending  phonemes. 

The  overall  intelligibility  score  is  obtained  by 
simply  averaging  over  the  word  intelligibility  scores. 
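The scoring chain described above (linear mapping per phoneme, lower of the two phoneme scores per word, plain average over words) can be sketched as follows. The linear mapping is taken from line 330 of SCOR2 (Table 5), TS(M)=10*A/(AH-AL)-AL/(AH-AL); the function names are hypothetical, and reading AH and AL as tenths of the measure is an interpretation of the factor of 10 in that line, not something the report states.

```python
def phoneme_score(measure, a_high, a_low):
    """Map an intelligibility measure to a score limited to 0..1.

    a_high and a_low are the per-phoneme parameters (as listed in Table 8),
    assumed here to be expressed in tenths of the measure.
    """
    score = (10.0 * measure - a_low) / (a_high - a_low)
    return min(1.0, max(0.0, score))


def word_score(start_score, end_score):
    """A word scores as the lower of its starting and ending phonemes."""
    return min(start_score, end_score)


def overall_score(word_scores):
    """Plain average over the word scores."""
    return sum(word_scores) / len(word_scores)


if __name__ == "__main__":
    # Hypothetical measures for one word, with AH=09, AL=04 and AH=05, AL=03.
    start = phoneme_score(0.65, 9, 4)
    end = phoneme_score(0.45, 5, 3)
    print(start, end, word_score(start, end))
```

With these parameters a measure at or below AL/10 scores 0 and a measure at or above AH/10 scores 1.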


2.2  Test  Tape.  As  mentioned  previously,  the  test 
tape  contains  a  set  of  50  phonetically  balanced  words 
each  of  which  is  preceded  by  a  timing  code.  The  timing 
code  consists  of  a  1  KHz  tone  which  is  modulated  by  a 
pulse pattern as shown in Figure 2.

Figure 2. Timing Code Pulse Pattern (time scale marking: 10 msec)
This code has a number of properties which make it
useful for this application. It can provide accurate
timing because it has good autocorrelation properties.
The autocorrelation function is 4 for zero shift and is
no higher than 1 for any pulse shift other than zero.
The code has sufficient energy to ensure good decoding
under signal-to-noise ratios lower than those for which
usable speech communication can be obtained. The frequency
components of the code, being centered on 1 KHz, are
such that they will pass any reasonable speech communication
system.
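The autocorrelation property can be checked with a short Python sketch. The pulse pattern used here is an assumption: Figure 2 is not legible in this copy, so the four pulse slots are inferred from the memory locations that the DCOD5 listing (Table 2) adds and subtracts, taken as ten slots with pulses in slots 1, 5, 7 and 8. The function name is hypothetical.

```python
# Assumed 10-slot timing code (1 = pulse present), inferred from the
# DCOD5 tap addresses rather than read from Figure 2.
PATTERN = [0, 1, 0, 0, 0, 1, 0, 1, 1, 0]


def autocorrelation(pattern, shift):
    """Aperiodic autocorrelation: count pulses that coincide after a shift."""
    return sum(pattern[i] * pattern[i + shift]
               for i in range(len(pattern) - shift))


if __name__ == "__main__":
    for shift in range(len(PATTERN)):
        print(shift, autocorrelation(PATTERN, shift))
```

Under this assumed pattern the spacings between pulses are all different, which is what keeps every off-peak value at 1 or below.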

Tests  conducted  on  the  code  with  decoder  circuitry 
and  algorithm  described  elsewhere  in  the  report  yielded 
the  following  results: 

1. Reliable decoding was obtained with code
signal amplitudes between 2 volts and 12 volts.

2. Reliable decoding in noise was obtained at a
value of code signal/noise spectral density of
37 dB-Hz. Approximately 50% decoding was
obtained at a value of code signal/noise
spectral density of 31 dB-Hz. (This is well
below the value at which usable speech intelligibility
is obtained.)

3. Total jitter in decode delay was ±2.5 msec
under all conditions.

2.3  System  Hardware.  In  this  section  a  detailed 
description  is  presented  of  each  of  the  major  blocks 
shown  in  the  block  diagram  of  Figure  1. 
2.3.1 Pre-Amplifier. A schematic of the pre-amplifier
is shown in Figure 3. The purpose of the pre-amplifier
is to amplify the signal from the tape recorder
to a level suitable for driving the filters. As
shown, the pre-amplifier consists of two stages with an
overall gain from the audio input of 30. A volume
control is provided to permit adjusting the signal level
out of the pre-amplifier to the correct value of approximately
6 volts peak. It should be noted that, because
of the normalizing properties of the intelligibility
measure algorithm, the adjustment is not critical. In
addition to the audio input which is normally used, a
second input with switchable gain is provided to permit
adding noise for system test and evaluation.

2.3.2  Filters.  As  shown  in  Figure  1,  twelve 
filters  are  provided,  each  with  a  different  center 
frequency.  Each  filter  has  a  bandwidth  approximately  20% 
of the center frequency. A schematic of the filter circuitry
is shown in Figure 4. The filter proper is made
up of the first two stages which are stagger tuned, the
first stage 5% below the center frequency, the second
stage 5% above the center frequency. The tuning of the
filters is accomplished by selection of the resistors
Ra1, Ra2, Ra3, Rb1, Rb2, Rb3 and capacitors C.
The values of these resistors and capacitors for each
filter are given in Table 1. Stagger tuning provides a
relatively  flat  response  in  the  passband  of  the  filter 
with  good  rejection  outside  the  passband.  A  sample 
response  for  the  1  KHz  filter  is  shown  in  Figure  5. 
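The stage frequencies follow directly from the 5% rule. A minimal sketch, assuming the center frequencies of Table 1 are in Hz; the function name is hypothetical:

```python
def stagger_tuned_pair(center_hz, offset_fraction=0.05):
    """Return the two stage frequencies, 5% below and 5% above center."""
    lower = center_hz * (1.0 - offset_fraction)
    upper = center_hz * (1.0 + offset_fraction)
    return lower, upper


if __name__ == "__main__":
    # A few of the center frequencies listed in Table 1.
    for center in (250, 315, 1000, 4000):
        lower, upper = stagger_tuned_pair(center)
        print(center, round(lower, 2), round(upper, 2))
```

For the 250 filter this gives 237.5 and 262.5, and for the 315 filter 299.25 and 330.75, which agree with the stage frequencies tabulated in Table 1.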

The  third  and  fourth  stage  make  up  a  full-wave 
rectifier  and  single  pole  low  pass  filter.  The  fifth 
Figure 4. Filter Schematic


Table 1. Selected Values for Filter Components

fo      foa       fob       C       R1     R2     R3

250     237.5               .047    68K    470    300K
                  262.5     .047    62K    390    270K
315     299.25              .047    56K    330    220K
                  330.75    .047    51K    270    200K
400     380                 .047    43K    220    180K
                  420       .047    39K    150    160K
500     475                 .047    36K    100    150K
                  525       .047    33K    82     130K
630     598.5               .01     130K   1000   510K
                  661.5     .01     120K   1000   470K
800     760                 .01     100K   820    430K
                  840       .01     91K    680    390K
1000    950                 .01     82K    560    330K
                  1050      .01     75K    470    300K
1250    1187.5              .01     68K    390    270K
                  1312.5    .01     62K    330    240K
1600    1520                .01     51K    270    200K
                  1680      .01     47K    220    180K
2000    1900                .0047   91K    680    360K
                  2100      .0047   82K    560    330K
2500    2375                .0047   68K    470    300K
                  2625      .0047   62K    390    270K
3150    2992.5              .0047   56K    330    220K
                  3307.5    .0047   51K    270    200K
4000    3800                .0047   43K    220    180K
                  4200      .0047   39K    150    160K


stage  is  used  for  a  two  pole  low  pass  filter.  Thus  the 
overall  low  pass  response  is  that  of  a  three  pole  filter. 
Each  pole  is  tuned  to  approximately  11  Hz.  The  use  of 
this three pole filter provides good smoothing with
essentially independent measurements when the measurements
are separated by 50 milliseconds or more.

2.3.3 Code Detector. The code detector works with
the 1 KHz filter (as shown in Figure 1) to provide a
demodulated signal to the computer for accurate decoding.
A schematic of the code detector is shown in Figure 6.
The signal to the code detector is taken from the second
stage of the 1 KHz filter. As shown, the first two stages
of the code detector constitute a full wave rectifier and
low pass filter. The response time of this low pass
filter is faster than that of the 1 KHz filter in order
to ensure accurate timing. The final stage of the code
detector is a comparator which provides a signal at 0
volts in the absence of a code pulse and +5 volts in the
presence of a code pulse. These signals are suitable for
driving the computer for decoding.

2.3.4 Multiplexer and A/D Converter. The multiplexer
and A/D converter is a commercial unit, the AIM16,
built by Connecticut Microcomputer, Inc. The AIM16 is
capable of selecting 1 of 16 analog inputs in response
to a digital multiplex address. The selected analog
input signal is converted to an 8 bit digital signal
which is available to the computer. The conversion time
is less than 100 microseconds. The input voltage range
is 0 to 5.12 volts which is converted to a count between
0 and 255 (00 to FF hex). Resolution is thus 20 millivolts
per count.
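The resolution figure follows from dividing the 5.12 volt range into 256 counts. A small sketch of that arithmetic is given below; the function names are hypothetical, and truncating to the lower count and clamping out-of-range inputs are assumptions, since the report does not describe how the AIM16 rounds or saturates.

```python
FULL_SCALE_VOLTS = 5.12   # input range stated in the text
COUNTS = 256              # 8 bit converter: counts 0..255


def volts_to_count(volts):
    """Convert an input voltage to an 8 bit count (assumed truncation)."""
    count = int(volts * COUNTS / FULL_SCALE_VOLTS)
    return max(0, min(COUNTS - 1, count))


def count_to_volts(count):
    """Voltage at the bottom of the given count's 20 millivolt step."""
    return count * FULL_SCALE_VOLTS / COUNTS


if __name__ == "__main__":
    print(FULL_SCALE_VOLTS / COUNTS)   # volts per count
    print(volts_to_count(2.56))        # mid-scale input
    print(format(volts_to_count(5.12), "02X"))
```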

2.3.5  Computer.  The  computer  used  is  the  AIM65 
built  by  Rockwell  International  Corporation.  It  is  based 
on  the  6502  microprocessor.  In  addition  to  the  4K  of  RAM 
memory  resident  on  the  computer  board,  an  additional  8K 
of  RAM  memory  was  added  to  permit  adequate  storage  of 
programs  and  data.  The  computer  permits  very  simple 
interface  to  external  circuitry  through  a  set  of  four 
8  bit  I/O  ports,  and  thus  was  particularly  useful  for 
this  application  where  such  ease  of  interfacing  was  an 
important  consideration.  The  computer  can  be  programmed 
in  machine  language  and  in  Basic  through  the  use  of  a 
Basic  RAM. 

2.4 System Software. The system software is
made up of a number of programs written both in Basic and
machine language. The master program, written in Basic,
is SCOR2. This in turn calls as a subroutine STOR8, which
is written in machine language. STOR8 calls as subroutines
DCOD5 and SAMP3. Listings for these programs are
given in Tables 2, 3, 4 and 5.

DCOD5 performs the decoding of the timing code. It
samples the output of the code detector every 50 milliseconds.
The successive outputs are shifted into memory
so that 19 successive time samples are stored. These
memory locations are then examined for the presence of a
proper code by adding the numbers stored in locations
corresponding to the presence of code pulses and subtracting
the numbers corresponding to the absence of code
pulses. If this process yields 3 or more a code is
declared to be present. This permits 1 pulse of the code
to be lost due to noise, or it permits one false pulse to
occur.
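The add-and-subtract test can be sketched in Python as follows. The tap positions are read from the memory offsets in the DCOD5 listing (Table 2): offsets 3, 11, 15 and 17 from the start of the sample history are added, and offsets 1, 5, 7, 9, 13 and 19 are subtracted. Treating the history as a 20-entry list with the newest sample first is an assumption about how to model that listing, and the function names are hypothetical.

```python
PULSE_TAPS = (3, 11, 15, 17)        # history offsets where a pulse is expected
GAP_TAPS = (1, 5, 7, 9, 13, 19)     # history offsets where no pulse is expected
THRESHOLD = 3
HISTORY_LENGTH = 20


def code_present(history):
    """Apply the add/subtract test to a history of 0/1 detector samples."""
    total = sum(history[i] for i in PULSE_TAPS)
    total -= sum(history[i] for i in GAP_TAPS)
    return total >= THRESHOLD


def first_detection(samples):
    """Shift samples in one at a time; return the index where a code is found."""
    history = [0] * HISTORY_LENGTH
    for index, sample in enumerate(samples):
        history = [sample] + history[:-1]   # newest sample at offset 0
        if code_present(history):
            return index
    return None


def ideal_code_stream():
    """Detector samples that line up with every pulse tap at the last sample."""
    stream = [0] * HISTORY_LENGTH
    for tap in PULSE_TAPS:
        stream[HISTORY_LENGTH - 1 - tap] = 1
    return stream


if __name__ == "__main__":
    print(first_detection(ideal_code_stream()))
```

With a threshold of 3 out of a possible 4, the test still passes when one pulse is missing or when one false pulse lands on a location that should be empty, as the text describes.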

SAMP3 addresses the multiplexer and A/D converter
so as to read successively the contents of the 12 filters.
These are stored in successive memory locations.

STOR8 performs the initialization of various memory
locations as required, it calls DCOD5, and after a timing
code is detected, it times to the first sample and then
times to succeeding samples of a word. At each sample
time it calls SAMP3 to read the contents of the 12 filters.
The timing between samples and the number of samples per
phoneme have been determined for each word by analysis
of the frequency content as a function of time for each
word. A listing of the number of samples used in the
leading and ending phoneme of each word is given in
Table 6. A listing of the times to the samples is given
in Table 7. The numbers in Table 7 are multiples of 25
milliseconds and are given in hex. The first number is
the time from the code to the first sample, the second
number is the time to the second sample, the third number
is the time to the third sample, the fourth number is the
time to the first sample of the ending phoneme, the fifth
number is the time to the second sample of the ending
phoneme, and the sixth number is the time to the third
sample of the ending phoneme. The hex number 00 indicates
that there is no sample. The numbers in Table 7 are in
groups of 6 for each word.
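As an illustration of the Table 7 format, the sketch below decodes one group of six hex entries into times in milliseconds, using the 25 millisecond unit stated in the text and skipping the 00 entries. The function name and the dictionary layout are hypothetical; the example group is the first six entries of Table 7.

```python
TICK_MS = 25  # Table 7 entries are stated to be multiples of 25 milliseconds


def decode_word_times(hex_entries):
    """Decode one word's six Table 7 entries into millisecond values.

    The first three entries belong to the starting phoneme and the last
    three to the ending phoneme; an entry of 00 means no sample.
    """
    values = [int(entry, 16) for entry in hex_entries]
    if len(values) != 6:
        raise ValueError("each word has exactly six entries")

    def to_ms(group):
        return [value * TICK_MS for value in group if value != 0]

    return {"start": to_ms(values[:3]), "end": to_ms(values[3:])}


if __name__ == "__main__":
    first_word = "31 13 00 40 13 00".split()
    print(decode_word_times(first_word))
```

The number of non-zero entries in each half (two and two for this word) corresponds to the sample counts listed for the same word in Table 6.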

SCOR2 calls STOR8 as a subroutine and then performs
the necessary calculations to determine an intelligibility
score. It is useful to review the function of some of the
lines of SCOR2 to describe the program operation. Lines
20 and 30 identify and call STOR8. Lines 160 through 290
compute the intelligibility measure (A) for each phoneme.
Line 120 looks up in memory the number of samples to be
used in that computation. Line 330 computes the intelligibility
score (TS(M)) for each phoneme, based on the
intelligibility measure (A) and the parameters AH and AL.
AH and AL are, in general, different for each phoneme
and are found in lines 310 and 320. A listing of AH and
AL is given in Table 8. Lines 332 through 339 ensure that
the intelligibility score can never exceed 1 or be less
than 0. Lines 360 through 390 select the lower intelligibility
score for each word. Line 410 computes a
running average of the intelligibility score.
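Table 2. DCOD5

3C00  A2  LDX  #13       ; clear the sample history, 3C60 through 3C73
3C02  A9  LDA  #00
3C04  9D  STA  3C60,X
3C07  CA  DEX
3C08  10  BPL  3C02
3C0A  A9  LDA  #7C       ; load the sample interval timer
3C0C  8D  STA  A008
3C0F  A9  LDA  #13
3C11  8D  STA  A009
3C14  A2  LDX  #12       ; shift the stored samples up one location
3C16  BD  LDA  3C60,X
3C19  9D  STA  3C61,X
3C1C  CA  DEX
3C1D  10  BPL  3C16
3C1F  AD  LDA  A000      ; read the code detector output (bit 6)
3C22  29  AND  #40
3C24  4A  LSR  A         ; move bit 6 down to bit 0
3C25  4A  LSR  A
3C26  4A  LSR  A
3C27  4A  LSR  A
3C28  4A  LSR  A
3C29  4A  LSR  A
3C2A  8D  STA  3C60      ; store as the newest sample
3C2D  18  CLC
3C2E  A9  LDA  #00
3C30  6D  ADC  3C63      ; add the samples where code pulses are expected
3C33  6D  ADC  3C6B
3C36  6D  ADC  3C6F
3C39  6D  ADC  3C71
3C3C  38  SEC
3C3D  ED  SBC  3C61      ; subtract the samples where no pulse is expected
3C40  ED  SBC  3C65
3C43  ED  SBC  3C67
3C46  ED  SBC  3C69
3C49  ED  SBC  3C6D
3C4C  ED  SBC  3C73
3C4F  E9  SBC  #03       ; compare the result with the threshold of 3
3C51  10  BPL  3C5C      ; 3 or more: code present, return
3C53  A9  LDA  #20       ; otherwise wait for the timer flag
3C55  2C  BIT  A00D
3C58  F0  BEQ  3C55
3C5A  D0  BNE  3C0A      ; then take the next sample
3C5C  60  RTS
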


Table 3. SAMP3

3C80  A9  LDA  #0C
3C82  A8  TAY
3C83  09  ORA  #10
3C85  8D  STA  3CB0
3C88  88  DEY
3C89  A9  LDA  #20
3C8B  8D  STA  A000
3C8E  AD  LDA  3CB0
3C91  8D  STA  A000
3C94  A2  LDX  #49
3C96  CA  DEX
3C97  10  BPL  3C96
3C99  A9  LDA  #20
3C9B  4D  EOR  A000
3C9E  8D  STA  A000
3CA1  AD  LDA  A00F
3CA4  99  STA  27BC,Y
3CA7  CE  DEC  3CB0
3CAA  88  DEY
3CAB  10  BPL  3C89
3CAD  60  RTS
Table 4. STOR8

3D00  A9  LDA  #00
3D02  8D  STA  A003
3D05  8D  STA  A00B
3D08  8D  STA  3DF8
3D0B  8D  STA  3DF1
3D0E  A9  LDA  #BF
3D10  8D  STA  A002
3D13  A9  LDA  #00
3D15  8D  STA  3CA5
3D18  A9  LDA  #30
3D1A  8D  STA  3CA6
3D1D  A9  LDA  #00
3D1F  8D  STA  3D47
3D22  8D  STA  3D9E
3D25  A9  LDA  #2C
3D27  8D  STA  3D48
3D2A  8D  STA  3D9F
3D2D  20  JSR  3D40
3D30  60  RTS
3D31  4C  JMP  3D00

3D40  20  JSR  3C00
3D43  AE  LDX  3DF8
3D46  BD  LDA  2D26,X
3D49  8D  STA  3DF2
3D4C  A9  LDA  #00
3D4E  8D  STA  3DF3
3D51  A9  LDA  #80
3D53  8D  STA  A008
3D56  A9  LDA  #13
3D58  8D  STA  A009
3D5B  A9  LDA  #20
3D5D  2C  BIT  A00D
3D60  F0  BEQ  3D5D
3D62  EE  INC  3DF3
3D65  AD  LDA  3DF3
3D68  CD  CMP  3DF2
3D6B  D0  BNE  3D51
3D6D  AD  LDA  3DF8
3D70  D0  BNE  3D8D
3D72  A9  LDA  #80
3D74  8D  STA  A000
3D77  A9  LDA  #80
3D79  8D  STA  A008
3D7C  A9  LDA  #13
3D7E  8D  STA  A009
3D81  A9  LDA  #20
3D83  2C  BIT  A00D
3D86  F0  BEQ  3D83
3D88  A9  LDA  #00
3D8A  8D  STA  A000
3D8D  20  JSR  3C80
3D90  EE  INC  3DF8
3D93  AD  LDA  3DF8
3D96  C9  CMP  #06
3D98  F0  BEQ  3DB6
3D9A  AE  LDX  3DF8
3D9D  BD  LDA  2D26,X
3DA0  F0  BEQ  3D90
3DA2  18  CLC
3DA3  AD  LDA  3CA5
3DA6  69  ADC  #0C
3DA8  8D  STA  3CA5
3DAB  AD  LDA  3CA6
3DAE  69  ADC  #00
3DB0  8D  STA  3CA6
3DB3  4C  JMP  3D43
3DB6  EE  INC  3DF1
3DB9  AD  LDA  3DF1
3DBC  C9  CMP  #32
3DBE  F0  BEQ  3DF0
3DC0  A9  LDA  #00
3DC2  8D  STA  3DF8
3DC5  18  CLC
3DC6  AD  LDA  3CA5
3DC9  69  ADC  #0C
3DCB  8D  STA  3CA5
3DCE  AD  LDA  3CA6
3DD1  69  ADC  #00
3DD3  8D  STA  3CA6
3DD6  18  CLC
3DD7  AD  LDA  3D47
3DDA  69  ADC  #06
3DDC  8D  STA  3D47
3DDF  8D  STA  3D9E
3DE2  AD  LDA  3D48
3DE5  69  ADC  #00
3DE7  8D  STA  3D48
3DEA  8D  STA  3D9F
3DED  4C  JMP  3D40
3DF0  60  RTS
Table 5. SCOR2

10 DIM TS(2)
20 POKE 04,00: POKE 05,61
30 Y=USR(0)
40 IN=0
50 NS=12032
60 XS=8192
70 YS=12288
80 AS=11776
90 FOR L=0 TO 49
100 PRINT! "WORD #"L+1
110 FOR M=0 TO 1
120 N=PEEK(NS+2*L+M)
130 S=0
140 SX=0
150 SY=0
160 FOR K=1 TO N
170 FOR J=0 TO 2
180 FOR I=0 TO 3
190 X=PEEK(XS+I)
200 Y=PEEK(YS+I)
210 S=S+X*Y
220 SX=SX+X*X
230 SY=SY+Y*Y
240 NEXT I
250 XS=XS+4
260 YS=YS+4
270 NEXT J
280 NEXT K
290 A=(S*S/SX)/SY
300 PRINT! INT(A*100)/100
310 AH=PEEK(AS+4*L+2*M)
320 AL=PEEK(AS+4*L+2*M+1)
330 TS(M)=10*A/(AH-AL)-AL/(AH-AL)
332 IF TS(M)>1 THEN 337
334 IF TS(M)<0 THEN 339
336 GOTO 340
337 TS(M)=1
338 GOTO 340
339 TS(M)=0
340 PRINT! INT(TS(M)*100)/100
350 NEXT M
360 IF TS(0)>TS(1) THEN 390
370 TS(2)=TS(0)
380 GOTO 410
390 TS(2)=TS(1)
410 IN=(IN*L+TS(2))/(L+1)
420 PRINT! INT(IN*100)/100
430 NEXT L
Table 6. Number of Samples per Phoneme

2F00  02 02 02 02
2F04  02 02 02 02
2F08  02 02 02 02
2F0C  02 02 02 02
2F10  02 02 02 02
2F14  02 02 02 03
2F18  01 02 01 02
2F1C  02 02 02 02
2F20  02 01 01 02
2F24  01 01 02 02
2F28  02 02 02 02
2F2C  01 01 01 02
2F30  01 02 01 01
2F34  02 02 02 01
2F38  02 01 02 02
2F3C  02 01 02 01
2F40  02 01 02 02
2F44  01 01 01 02
2F48  02 01 01 01
2F4C  02 01 01 01
2F50  01 02 01 02
2F54  01 02 01 01
2F58  01 02 01 02
2F5C  01 02 01 02
2F60  02 02 01 03



Table 7. Times to Sample Measurements

2C00  31 13 00 40
2C04  13 00 4E 09
2C08  00 59 13 00
2C0C  75 09 00 54
2C10  09 00 B4 09
2C14  00 40 13 00
2C18  76 09 00 81
2C1C  09 00 BD 09
2C20  00 4F 13 00
2C24  A2 09 00 68
2C28  09 00 BB 09
2C2C  00 4F 13 00
2C30  A0 09 00 59
2C34  2C 00 B4 0E
2C38  00 31 09 00
2C3C  7E 09 00 40
2C40  09 00 5B 09
2C44  00 45 13 09
2C48  72 00 00 6D
2C4C  09 00 DB 00
2C50  00 86 09 00
2C54  BB 09 00 86
2C58  09 00 B5 09
2C5C  00 4A 13 00
2C60  A5 09 00 72
2C64  00 00 CD 00
2C68  00 6D 31 00
2C6C  C4 00 00 4F
2C70  00 00 6F 09
2C74  00 54 13 00
2C78  80 13 00 59
2C7C  27 00 B6 09
2C80  00 45 09 00
2C84  9B 00 00 8B
2C88  00 00 CD 00
2C8C  00 54 40 00
2C90  AB 00 00 45
2C94  4A 00 B1 00
2C98  00 5E 00 00
2C9C  96 13 00 59
2CA0  27 00 78 13
2CA4  00 4A 00 00
2CA8  A9 1D 00 4F
2CAC  00 00 83 13
2CB0  00 59 13 00
2CB4  90 13 00 40
2CB8  00 00 9E 13
2CBC  00 4A 00 00
2CC0  B7 09 00 4A
2CC4  09 00 32 13
2CC8  00 45 13 00
2CCC  97 00 00 5E
2CD0  00 00 90 00
2CD4  00 45 13 00
2CD8  9B 13 00 36
2CDC  00 00 D3 00
2CE0  00 4F 00 00
2CE4  7A 09 00 3B
2CE8  00 00 92 00
2CEC  00 86 00 00
2CF0  DE 00 00 36
2CF4  13 00 86 00
2CF8  00 40 13 00
2CFC  87 00 00 45
2D00  13 00 84 00
2D04  00 40 00 00
2D08  A7 00 00 40
2D0C  13 00 EA 00
2D10  00 45 09 00
2D14  97 00 00 4A
2D18  09 00 A9 00
2D1C  00 54 1D 00
2D20  9C 09 00 68
2D24  09 00 D5 00
2D28  00 68 13 09
2D2C  11 52 19 50

Table 8. Parameters for Converting Intelligibility
Measure to Intelligibility Score

       aH aL aH aL             aH aL aH aL

2E00   09 04 05 03      2E98   09 04 0A 03
2E04   0A 04 0A 04      2E9C   07 02 09 03
2E08   05 03 04 02      2EA0   0A 08 0A 03
2E0C   05 02 07 03      2EA4   08 05 06 03
2E10   09 01 08 01      2EA8   05 03 07 03
2E14   09 02 09 04      2EAC   07 02 0A 04
2E18   09 02 09 03      2EB0   08 06 08 05
2E1C   08 02 08 02      2EB4   08 05 08 03
2E20   04 01 0A 03      2EB8   07 04 08 03
2E24   09 04 0A 03      2EBC   0A 03 0A 03
2E28   08 05 09 04      2EC0   04 02 06 02
2E2C   09 02 07 02      2EC4   07 05 04 02
2E30   04 03 07 03
2E34   04 02 06 03
2E38   09 04 07 02
2E3C   04 02 09 05
2E40   03 02 04 01
2E44   06 02 06 03
2E48   09 05 06 03
2E4C   07 02 06 03
2E50   09 01 06 01
2E54   06 03 05 03
2E58   08 04 07 03
2E5C   04 02 0A 06
2E60   05 02 09 02
2E64   0A 04 08 01
2E68   09 02 04 02
2E6C   06 04 07 02
2E70   0A 07 08 01
2E74   05 02 05 03
2E78   0A 03 08 02
2E7C   06 04 09 04
2E80   06 02 07 05
2E84   09 03 09 03
2E88   09 07 07 02
2E8C   06 04 05 04
2E90   05 04 08 03
2E94   06 05 07 04
3. TEST RESULTS

In  order  to  permit  calibration  and  evaluation 
of  the  system,  intelligibility  tests  were  conducted  with 
human  subjects.  The  word  list  used,  which  is  the  same  as 
that  recorded  on  the  test  tape,  is  shown  in  Table  9. 

From the master tape, additional tapes were prepared
with increasing noise. These tapes were intermixed
with other word tapes with varying amounts of noise
in order to reduce the likelihood of word memorization
by the human subjects. To further reduce the likelihood
of word memorization, the noisier tapes were played first
during the intelligibility tests.

A summary of the intelligibility test results is
shown in Table 10. The reference S/N shown in Table 10
corresponds approximately to a peak voice signal/noise
spectral density of 63 dB-Hz. Because of the difficulty
of accurately defining the signal power in a voice
signal, all values in Table 10 are shown relative to
this reference.

In addition to the overall results shown in Table
10, detailed results were obtained for the starting and
ending phonemes of individual words. These were used to
determine the parameters with which the system converts
intelligibility measure to intelligibility score.

The  same  tapes  used  for  obtaining  intelligibility 
scores  with  human  listeners  were  then  run  through  the 
system  and  intelligibility  scores  were  obtained.  These 
are  tabulated  in  Table  11. 

A  graph  showing  intelligibility  scores  as  a  function 
of  signal  to  noise  ratio  for  both  human  listeners  and  the 
Table 9. Word List

click    brass    gob      eye      slice
slush    pack     ace      rouge    cart
rap      in       flash    pad      route
quip     salve    cork     pew      did
theme    crate    wretch   skid     wash
fair     web      threw    clog     robe
soak     get      seed     joke     wise
duke     hump     lid      walk     gang
beard    puss     tilt     base     judge
roost    mow      souse    sigh     fast
Table 10. Number of Missed Words as Function of
Signal to Noise Ratio

                 No      Ref.
                 Noise   S/N     -3 dB   -6 dB   -9 dB   -12 dB  -15 dB  -18 dB  -21 dB

Listener #1      2       12      17      20      24      29      35      38      37
Listener #2      4       14      15      20      24      31      27      28      29
Listener #3      3       11      11      17      20      20      27      24      33
Listener #4      2       15      13      18      20      24      26      32      31
Listener #5      7       17      17      22      23      30      33      30      34
Listener #6      6       14      17      18      25      24      29      27      28
Listener #7      4       13      15      16      19      31      31      34      31
Listener #8      6       12      14      24      21      27      30      32      36
Listener #9      7       12      15      16      20      26      28      29      30

Average          4.6     13.3    14.9    19      19.6    26.9    29.6    30.4    32.1
Int. Score (%)   91      73      70      62      61      46      41      39
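
The intelligibility scores listed in Table 10 are consistent with scoring each condition as the percentage of the 50 list words that were not missed, rounded to the nearest integer. The sketch below rests on that assumption, which is inferred from the tabulated numbers and is not a formula stated in the report. The function and variable names are illustrative.

```python
WORDS_IN_LIST = 50  # Table 9 contains 50 words

def intelligibility_score(avg_missed, total_words=WORDS_IN_LIST):
    """Percent of words identified correctly, rounded to an integer."""
    return round(100.0 * (total_words - avg_missed) / total_words)

# Average missed words from Table 10, paired with the tabulated scores
# for the eight conditions whose scores are listed.
avg_missed = [4.6, 13.3, 14.9, 19, 19.6, 26.9, 29.6, 30.4]
tabulated = [91, 73, 70, 62, 61, 46, 41, 39]

if __name__ == "__main__":
    for missed, listed in zip(avg_missed, tabulated):
        print(missed, intelligibility_score(missed), listed)
```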
Table 11. Intelligibility Scores Obtained with
Intelligibility Scoring System

                 No      Ref.
                 Noise   S/N     -3 dB   -6 dB   -9 dB   -12 dB  -15 dB  -18 dB  -21 dB

Int. Score (%)   99      70      62      54      44      36      33      29      26



intelligibility  scoring  system  is  presented  in  Figure  7. 
The intelligibility scoring system gives a score of 99
(almost 100) with no noise, while the listener score is
91, because human listeners will misinterpret some words
even with an ideal communication system. The
intelligibility scoring system also gives a smoother
drop-off with decreasing signal-to-noise ratio.
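
One way to examine the drop-off described above is to difference the tabulated scores between adjacent noise conditions. The sketch below does this for the listener scores listed in Table 10 and the system scores in Table 11. The helper name is illustrative, and step-to-step differencing is only one possible reading of "smoother".

```python
def successive_drops(scores):
    """Return the decrease in score from each condition to the next."""
    return [a - b for a, b in zip(scores, scores[1:])]

# Scores in order: no noise, reference S/N, -3 dB, -6 dB, and so on.
listener_scores = [91, 73, 70, 62, 61, 46, 41, 39]    # Table 10 (listed values)
system_scores = [99, 70, 62, 54, 44, 36, 33, 29, 26]  # Table 11

if __name__ == "__main__":
    print("listener drops:", successive_drops(listener_scores))
    print("system drops:  ", successive_drops(system_scores))
```

From the reference S/N downward, the system's steps stay between 3 and 10 points, whereas the listener steps range from 1 to 15 points.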

The  close  match  between  the  results  obtained  with 
the  intelligibility  scoring  system  and  those  obtained 
with  human  listeners  is  not  surprising,  because  the 
program  parameters  used  in  the  scoring  system  were 
selected  to  provide  such  a  close  match.  In  general,  the 
program  parameters  can  be  adjusted  to  provide  almost  any 
characteristics  desired. 
4. RECOMMENDATIONS

The  tests  described  in  Section  3  were  for  a 
linear  system  in  which  noise  was  simply  added.  Additional 
tests  should  be  conducted  with  non-linear  systems  such  as 
CVSD  in  order  to  establish  the  usefulness  of  the  system. 

The  system,  as  designed,  can  give  erroneous  results 
if  the  tape  recorders  used  do  not  have  very  accurate 
speed  control.  The  time  from  the  timing  code  to  the  first 
sample  measurement  is  a  function  of  tape  speed.  Some 
provision  should  be  made  for  measuring  tape  speed  through 
the  use  of  multiple  timing  codes  or  by  other  methods,  and 
using  this  measurement  to  make  software  adjustments  in 
measurement  times. 
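
A minimal sketch of the software adjustment suggested above, assuming two timing codes recorded a known interval apart: the ratio of the measured interval to the nominal interval gives a speed factor, which then rescales the stored measurement times. All names and the numbers in the example are hypothetical; the report does not specify this procedure.

```python
def speed_factor(nominal_interval, measured_interval):
    """Ratio by which playback time is stretched (>1 means the tape runs slow)."""
    if nominal_interval <= 0 or measured_interval <= 0:
        raise ValueError("intervals must be positive")
    return measured_interval / nominal_interval

def corrected_times(nominal_times, factor):
    """Scale nominal delays (timing code to sample) by the speed factor."""
    return [t * factor for t in nominal_times]

if __name__ == "__main__":
    # Hypothetical case: codes recorded 2.0 s apart arrive 2.5 s apart.
    factor = speed_factor(2.0, 2.5)
    print(corrected_times([0.5, 1.0], factor))
```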

As described, the system must decode every timing
code in order to time the sample measurements. It
also counts the timing codes in order to keep track of
which word is currently being analyzed. Should a timing
code be missed, the count would be in error, and the
resultant score would be meaningless. Since there is,
in any real system, some finite probability that a
timing code will be missed, some provision should be made
to estimate the time of occurrence of a timing code if
one is missed.
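
One possible form of such a provision is sketched below, under the assumption (not stated in the report) that timing codes are nominally evenly spaced. Whenever the gap between two detected codes spans more than one nominal interval, the missing codes are interpolated so that the word count stays aligned. The function name, the spacing, and the rounding rule are illustrative.

```python
def fill_missed_codes(detected, nominal_spacing):
    """Return code times with estimates inserted where codes were missed.

    detected: increasing times at which timing codes were decoded.
    nominal_spacing: expected time between codes (assumed constant).
    """
    if nominal_spacing <= 0:
        raise ValueError("nominal_spacing must be positive")
    filled = []
    for t in detected:
        if filled:
            start = filled[-1]
            gap = t - start
            # Number of code intervals the gap most likely spans.
            intervals = max(1, round(gap / nominal_spacing))
            step = gap / intervals
            # Insert evenly spaced estimates for the missed codes.
            for k in range(1, intervals):
                filled.append(start + k * step)
        filled.append(t)
    return filled

if __name__ == "__main__":
    # Hypothetical run in which the code expected near t = 4.0 was missed.
    print(fill_missed_codes([0.0, 2.0, 6.0, 8.0], 2.0))
```

This version interpolates after the next code has arrived. A system that scores in real time would instead have to extrapolate from the last decoded code.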